{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "4260a150",
   "metadata": {},
   "source": [
    "# Model Card for DistilBERT base multilingual (cased)\n",
    "\n",
    "You can get more details from [Bert in PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/bert/README.md)。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53f1b1c2",
   "metadata": {},
   "source": [
    "This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f417583b",
   "metadata": {},
   "source": [
    "The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).\n",
    "The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).\n",
    "On average, this model, referred to as DistilmBERT, is twice as fast as mBERT-base.\n",
    "\n",
    "- **Developed by:** Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)\n",
    "- **Model type:** Transformer-based language model\n",
    "- **Language(s) (NLP):** 104 languages; see full list [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages)\n",
    "- **License:** Apache 2.0\n",
    "- **Related Models:** [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased)\n",
    "- **Resources for more information:**\n",
    "- [GitHub Repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md)\n",
    "- [Associated Paper](https://arxiv.org/abs/1910.01108)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f47ce9b7",
   "metadata": {},
   "source": [
    "## How to use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a1353b5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --upgrade paddlenlp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e23a860f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import paddle\n",
    "from paddlenlp.transformers import AutoModel\n",
    "\n",
    "model = AutoModel.from_pretrained(\"distilbert-base-multilingual-cased\")\n",
    "input_ids = paddle.randint(100, 200, shape=[1, 20])\n",
    "print(model(input_ids))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38c30ea4",
   "metadata": {},
   "source": [
    "# Citation\n",
    "\n",
    "```\n",
    "@article{Sanh2019DistilBERTAD,\n",
    "  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},\n",
    "  author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},\n",
    "  journal={ArXiv},\n",
    "  year={2019},\n",
    "  volume={abs/1910.01108}\n",
    "}\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ee03d6a",
   "metadata": {},
   "source": [
    "> The model introduction and model weights originate from  https://huggingface.co/distilbert-base-multilingual-cased  and were converted to PaddlePaddle format for ease of use in PaddleNLP.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
